(57) Abstract: Methods using gel electrophoresis and mass spectrometry for the rapid, quantitative analysis of proteins or protein
function in mixtures of proteins derived from two or more samples in one unit operation are disclosed. In one embodiment the
method includes (a) preparing an extract of proteins from each of at least two different samples; (b) providing a set of substantially
chemically identical and differentially isotopically labeled protein reagents, one for each sample; (c) reacting each protein sample of
step (a) with a different reagent from the set of step (b) to provide isotopically labeled proteins; (d) mixing each of said isotopically
labeled proteins to form a single mixture of different isotopically labeled proteins; (e) electrophoresing the mixture of step (d) by an
electrophoresing method capable of separating proteins within said mixture; and (f) detecting the difference in the expression levels
of the proteins in the two samples by mass spectrometry based on individual peptides derived from chemical or enzymic digestion. The
analytical method can be used for qualitative and particularly for quantitative analysis of global protein expression profiles in cells
and tissues, i.e., the quantitative analysis of proteomes.

PROCESS FOR ANALYZING PROTEIN SAMPLES

FIELD OF THE INVENTION

This invention relates to a process for detecting differences in protein composition
between complex protein samples such as cell lysates, cell extracts, or tissue extracts. More
particularly, this invention relates to a process for analyzing protein compositions using gel
electrophoresis utilizing at least two labeled reagents capable of detecting such differences.

BACKGROUND OF THE INVENTION

Two dimensional (2D) electrophoresis has long been a mainstay in the quantitative
analysis of complex mixtures of proteins, as from cell lysates or organelles. The traditional
approach for quantifying proteins is to perform image analysis of the gels. The proteins can
be detected by staining the proteins, by autoradiography, or even by using antibodies specific
for certain proteins (Western blotting). Although powerful software has been developed to
quantify the amount of protein that migrates to a spot in a gel, there is a limit to how much
information can be obtained by such analyses even if the gels are perfectly reproducible and
even if the software for spot analysis is able to resolve ambiguities of overlapping spots and
uneven backgrounds. Recently, mass spectrometric techniques were described in published
PCT International Application WO 00/11208 in which stable isotopes are incorporated into
peptides derived from each protein, which bypasses the need for gels and for image analysis of
any kind, because quantitation is performed by a mass spectrometer. However, when proteins
are digested ahead of time, almost all information relating to protein chemical modification is
lost, and the quantitative information for different proteins that [...]
detected is combined together.

Proteins are essential for the control and execution of virtually every biological
process. The rate of synthesis and the half-life of proteins and thus their expression level are
also controlled post-transcriptionally. Furthermore, the activity of proteins is frequently
modulated by post-translational modifications, in particular protein phosphorylation, and
dependent on the association of the protein with other molecules including DNA and proteins.
Neither the level of expression nor the state of activity of proteins is therefore directly
apparent from the gene sequence or even the expression level of the corresponding mRNA
transcript. It is therefore highly desirable that a complete description of a biological system
include measurements that indicate the identity, quantity and the state of activity of the
proteins which constitute the system. The large-scale (ultimately global) analysis of proteins
expressed in a cell or tissue has been termed proteome analysis. Proteome analysis permits
the detection and monitoring of differences in cell structure, function and development. The
capability of determining differences in protein content between normal cells and abnormal
cells such as cancerous cells is a valuable diagnostic tool.

At present no protein analytical technology approaches the throughput and level of
automation of presently available genomic technology. The most common implementation of
proteome analysis is based on the separation of complex protein samples most commonly by
2D gel electrophoresis (2DE) and the subsequent sequential identification of the separated
protein species, typically by mass spectrometry. This approach has been revolutionized by the
development of powerful mass spectrometric techniques and the development of computer
algorithms which correlate protein and peptide mass spectral data with sequence databases
and thus rapidly and conclusively identify proteins. This technology has reached a level of
sensitivity which now permits the identification of essentially any protein which is detectable
by conventional protein staining methods including silver staining. In the 2DE/MSⁿ method,
proteins are quantified by densitometry of stained spots in the 2DE gels, followed by mass
spectrometry (MS), tandem mass spectrometry (MSMS or MS²), or multiple rounds of mass
spectrometry (MSⁿ). Alternatively, the staining step can be omitted, and the proteins can be
detected by mass spectrometry, for example, by analyzing extracts of every slice from a 1D
gel, or from every piece of a 2D gel, or by scanning membranes onto which digests from such
gels have been deposited by transblotting (Bienvenut et al., Anal. Chem. 71:4800-4807,
1999).

In gel electrophoresis, proteins can be separated into individual components according
to differences in mass by electrophoresing a protein mixture in a polyacrylamide gel under
denaturing conditions. One dimensional and two dimensional gel electrophoresis have
become standard tools for studying proteins. One dimensional SDS (sodium dodecyl sulfate)
electrophoresis through a cylindrical or slab gel reveals only the major proteins present in a
sample tested. Two dimensional polyacrylamide gel electrophoresis (2D PAGE), which
separates proteins by isoelectric focusing, i.e., by charge, in one dimension and by size in the
second dimension, provides higher resolving power, which is important when there are many
proteins in the sample. The proteins migrate in one- or two-dimensional gels as bands or spots
respectively. The separated proteins are visualized by a variety of methods, such as by
staining with a protein specific dye, by protein mediated silver precipitation, autoradiographic
detection of radioactively labeled protein, and by covalent or non-covalent attachment of
fluorescent compounds. Immediately following the electrophoresis, the resulting gel patterns
may be visualized by eye, photographically or by electronic image capture, for example, by
using a cooled charge-coupled device (CCD). To compare samples of proteins from different
cells or different stages of cell development by conventional methods, each different sample is
presently run on separate lanes of a one dimensional gel or separate two dimensional gels.
Comparison is by visual examination or electronic imaging, for example, by computer-aided
image analysis of digitized one or two dimensional gels. The goal of such research is often to
determine which proteins out of the hundreds of proteins that can be detected have changed in
expression level between a control sample and one or more experimental samples.

Two dimensional gel electrophoresis has been a powerful tool for resolving complex
mixtures of proteins. The differences in migration between the proteins, however, can be
subtle. Imperfections in the gel can interfere with accurate observations. In order to minimize
the imperfections, the gels provided in commercially available electrophoresis systems are
prepared with exacting precision. Even with meticulous controls, no two gels are identical.
The gels may differ one from the other in pH gradients or uniformity. In addition, the
electrophoresis conditions from one run to the next may be different. Computer software has
been developed for automated alignment of different gels. However, all of the software
packages are based on linear expansion or contraction of one or both of the dimensions on two
dimensional gels. The software has difficulty adjusting for local distortions in the gels. The
ideal way to overcome such limitations is to combine the two samples prior to gel
electrophoresis, assuming the two samples can be distinguished from one another at the
analysis stage.

It has been proposed in U.S. Patents 6,043,025 and 6,127,134 to provide a process for
analyzing protein compositions from at least two samples wherein one sample is stained with
a first dye and a second sample is stained with a second dye. The samples then are separated
either by a 1D or 2D gel electrophoresis process to effect protein separation into a plurality of
spots. A spot of interest then is analyzed to determine the difference in luminescent intensity
of the dyes thereby to determine protein concentration from each sample. The camera is able
to distinguish between the two dyes by the wavelengths of the emitted light, although dynamic
range can be compromised due to a small amount of spectral overlap between the dyes. For
this quantitation to be precise, the two species of proteins must migrate to exactly the same
spot, ideally the same position as the unmodified protein. In some instances, only a small
proportion of the protein is initially stained with the dyes. If there is any separation of stained
from unstained proteins, then some fluorescent proteins may co-migrate with unrelated
unstained proteins, resulting in misleading identifications in cases in which the protein is
identified post electrophoresis.

The development of methods and instrumentation for automated, data-dependent
electrospray ionization (ESI) tandem mass spectrometry (MSⁿ) in conjunction with
microcapillary liquid chromatography (µLC) and database searching has significantly
increased the sensitivity and speed of the identification of gel-separated proteins. As an
alternative to the 2DE/MSⁿ approach to proteome analysis, the direct analysis by tandem
mass spectrometry of peptide mixtures generated by the digestion of complex protein mixtures
has been proposed (Ducret et al., Prot. Sci. 7:706-719, 1998). Tandem µLC/MSMS has also
been used successfully for the large-scale identification of individual proteins directly from
mixtures without gel electrophoretic separation (Yates et al., Methods Mol. Biol., 146:17-26,
2000; Link et al., Nat. Biotechnol. 17:676-82, 1999; Opitek et al., Anal. Chem. 64:1518-
1524, 1997). While these approaches dramatically accelerate protein identification, the
absolute or relative quantities of the analyzed proteins cannot be easily determined, and these
methods have not been shown to substantially alleviate the dynamic range problem also
encountered by the 2DE/MSMS approach (Gygi et al., Proc. Natl. Acad. Sci. USA 17:9390-
5, 2000). Therefore, low abundance proteins in complex samples are also difficult to analyze
by the µLC/MSMS method without their prior enrichment.

An alternative to quantifying proteins in complex mixtures after SDS PAGE or 2D
PAGE on the basis of staining intensity using conventional protein stains or fluorescent stains
is to use protein stains to localize the regions of interest. Following proteolytic digestion, the
peptides may then be labeled with stable isotopes, for example with deuterated
nicotinoyloxysuccinimide (Munchbach, Quadroni, Miotto and James, Anal. Chem. A, 2000),
which allows mass spectrometry to be used for quantitation. This approach suffers from the
drawback that the protein ratio obtained is dependent on how carefully the spots are excised
from the gel. Also, the control and the experimental sample must be run on separate gels.

Alternatively, isotopically labeled amino acid precursors may be introduced
specifically into one of the two samples prior to proteolytic digestion (Sechi and Chait, Anal.
Chem., 24:5150-8, 1998; Chen, Smith and Bradbury, Anal. Chem. 72:1134-1143, 2000).
This approach suffers from the drawback that the proteins must be isolated from culture
conditions that allow close to complete replacement of the unlabeled amino acid precursors by
the labeled precursors, or the intensity of each peptide will be spread out over a larger isotope
cluster than usual, compromising both sensitivity and quantitation.

Recently, an approach was developed involving isotope coded affinity tags (ICAT™)
that combines the incorporation of stable isotopes into the cysteine-containing peptides of
proteins with the ability to affinity purify these modified peptides and to subsequently detect
the proteins by mass spectrometry (Gygi et al., Nat. Biotechnol., 17:994-9, 1999). Reagents
useful in carrying out this method are commercially available from Applied Biosystems
(Foster City, CA) under the ICAT™ brand. Because proteins typically have a small number
of cysteine residues, it becomes possible to identify large numbers of proteins by focusing on
a small subset of the peptides that are generated upon proteolytic digestion, making it possible
to penetrate further into the proteome without being overwhelmed by large numbers of
peptides from the most abundant proteins. Because the quantitation is performed by mass
spectrometry, two or more samples can be combined together prior to analysis, so that
artifactual sample processing differences do not affect the results so long as they take place
after cysteine modification.

There are, however, several limitations to the previously described ICAT reagent
based technology that in certain cases limit the information that can be obtained from the
experiment. The cysteine containing peptides should be sufficiently long to uniquely identify
proteins (or classes of homologous proteins). Because each peptide is separately purified,
MSⁿ techniques are often used to identify the protein from which the peptide was derived,
instead of the simple peptide mass fingerprinting (PMF) technique. No information is
retained about the intact molecular weight of the protein(s) from which the cysteine-containing
peptide was derived, or whether the protein was chemically modified by
phosphorylation. Finally, no information is obtained from proteins that do not contain
cysteine.

The present invention combines mass spectrometric quantitation with the resolving
power of 2D electrophoresis so that differences in protein compositions from two or more
samples containing complex mixtures can be determined from a single 2D gel. This
extension to the current state of ICAT reagent technology overcomes each of the foregoing
limitations. Proteins are modified by using the same ICAT reagent technology as before.
However, all the advantages of protein separation by 2D gels are preserved. Although
analysis of the ICAT reagent labeled peptides themselves usually leads to no information
about the chemical modification of the protein from which they derived, the position of the
protein on the gel is indicative of whether the protein was modified. Also, the chemically
modified peptides themselves are present in the same spot, thus the ICAT reagent labeled
peptides can still be used for quantitation of the relative amounts of each of the modified
species. In addition, ICAT reagent containing peptides of any length are now informative
because any one spot contains very few proteins. This also makes it possible to use PMF to
identify the proteins, including any non-cysteine containing proteins that may be present at the
same spot on the gel. These techniques still allow simultaneous processing of two or more
samples such as those obtained from an experimental and a control sample. This same
combination of technologies is also applicable to less resolving gel systems like 1D SDS
PAGE gel analysis, 1D isoelectric focusing gels and the like.

SUMMARY OF THE INVENTION 

This invention provides methods based upon 1D and 2D gel electrophoresis and mass
spectrometry for the rapid, quantitative analysis of proteins or protein function in mixtures of
proteins derived from two or more samples in one unit operation. Thus, only one gel must be
run in order to deduce which proteins have changed in expression level between the
experimental sample and the control sample because the quantitation is determined by mass
spectrometry. The analytical method can be used for qualitative and particularly for
quantitative analysis of global protein expression profiles in cells and tissues, i.e., the
quantitative analysis of proteomes. The method can also be employed to screen for and
identify proteins whose expression level in cells, tissue or biological fluids is affected by a
stimulus (e.g., administration of a drug or contact with a potentially toxic material), by a
change in environment (e.g., nutrient level, temperature, passage of time) or by a change in
condition or cell state (e.g., disease state, malignancy, site-directed mutation, gene knockouts)
of the cell, tissue or organism from which the sample originated. The proteins identified in
such a screen can function as markers for the changed state. For example, comparisons of
protein expression profiles of normal and malignant cells can result in the identification of
proteins whose presence or absence is characteristic and diagnostic of the malignancy.

The methods herein can also be used to implement a variety of clinical and diagnostic
analyses to detect the presence, absence, deficiency or excess of a given protein or protein
function in a biological fluid (e.g., blood), or in cells or tissue. The method is particularly
useful in the analysis of complex mixtures of proteins, i.e., those containing 5 or more distinct
proteins or protein functions. This method can also be used to look for absolute, quantitative
changes if specific calibrated standards are labeled.

As with the techniques described in the aforementioned published PCT patent
application (WO 00/11208), the present invention employs an isotopically labeled reagent,
which can be either an affinity-labeled protein reactive reagent or a non-affinity labeled protein
reactive reagent that allows for the selective isolation of peptide fragments from complex
mixtures. First, the control and the experimental sample(s) are labeled separately with
different isotopic variants of the ICAT reagent, and are then combined. Separation of the
protein components of the two or more samples is effected by either 1D or 2D gel
electrophoresis followed by protein digestion. The isolated peptide fragments or reaction
products are characteristic of the presence of a protein in those mixtures. Isolated peptides are
characterized by mass spectrometric (MS) techniques. The most abundant proteins may be
identified by peptide mass fingerprinting. Alternatively, the sequence of isolated peptides can
be determined using tandem MS (MSⁿ) techniques, and by application of presently available
sequence database searching techniques, the protein from which the sequenced peptide
originated can be identified. The reagents utilized in the process of this invention provide for
differential isotopic labeling of the isolated peptides that facilitates quantitative determination
by mass spectrometry of the relative amounts of proteins in different samples. Also, the use
of differentially isotopically labeled reagents as internal standards of known concentration
facilitates quantitative determination of the absolute amounts of one or more proteins or
reaction products present in the sample.

In general, the affinity labeled protein reactive reagents utilized in the process of this
invention have three portions: an affinity label (A) covalently linked to a protein reactive
group (PRG) through a linker group (L):

A-L-PRG

The linker may be differentially isotopically labeled, e.g., by substitution of one or more
atoms in the linker with a stable isotope thereof. For example, hydrogen atoms can be
substituted with deuterium atoms, or 12C with 13C.

The non-affinity labeled protein reactive reagents utilized in the process of this
invention have two portions: a protein reactive group (PRG) and a linker group (L):

L-PRG

which are as defined above.

The affinity label A functions as a molecular handle that selectively binds, covalently
or non-covalently, to a capture reagent (CR). Binding to CR facilitates isolation of peptides
labeled with A. In specific embodiments, A is a streptavidin or avidin. After affinity isolation
of affinity tagged materials, some of which may be isotopically labeled, the interaction
between A and the capture reagent is disrupted or broken to allow MS analysis of the isolated
materials. The affinity label, when utilized, can be displaced from the capture reagent by
addition of displacing ligand, which may be free A or a derivative of A, or by changing
solvent (e.g., solvent type or pH) or temperature conditions, or the linker may be cleaved
chemically, enzymatically, thermally or photochemically to release the isolated materials for
MS analysis.

The types of PRG group specifically provided herein include those groups that
selectively react with a protein functional group to form a covalent or non-covalent bond
tagging the protein at specific sites. In specific embodiments, PRG is a group having specific
reactivity for certain protein groups, such as specificity for sulfhydryl groups, and is useful in
general for selectively tagging proteins in complex mixtures. A sulfhydryl specific reagent
tags proteins containing cysteine.

Exemplary reagents useful in the process of this invention have the general formula

A-B1-X1-(CH2)n-[X2-(CH2)m]x-X3-(CH2)p-X4-B2-PRG

where:

A is optionally present and is the affinity label;
PRG is the protein reactive group;

X1, X2, X3 and X4, independently of one another, and X2 independently of other X2 in
the linker group, can be selected from O, S, NH, NR, NRR'+, CO, COO, COS, S-S,
SO, SO2, CO-NR', CS-NR', Si-O, aryl or diaryl groups or X1-X4 may be absent, but
preferably at least one of X1-X4 is present;

B1 and B2, independently of one another, are optional moieties that can facilitate
bonding of the A or PRG group to the linker or prevent undesired cleavage of those
groups from the linker and can be selected, for example, from COO, CO, CO-NR',
CS-NR' and may contain one or more CH2 groups alone or in combination with other
groups, e.g., (CH2)q-CONR', (CH2)q-CS-NR', or (CH2)q;

n, m, p and q are whole numbers that can have values from 0 to about 100, preferably
one of n, m, p or q is not 0 and x is also a whole number that can range from 0 to about
100 where the sum of n+xm+p+q is preferably less than about 100 and more
preferably less than about 20;

R is an alkyl, alkenyl, alkynyl, alkoxy or aryl group; and

R' is a hydrogen, an alkyl, alkenyl, alkynyl, alkoxy or aryl group.
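
The numeric preferences stated for the linker indices can be checked mechanically. The following Python fragment is only an illustrative sketch: the names `LinkerIndices` and `check_linker` are hypothetical, and it assumes that "about 100" and "about 20" may be treated as the hard limits 100 and 20, which the text does not require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkerIndices:
    """Whole-number indices n, m, p, q and x of the general formula."""
    n: int
    m: int
    p: int
    q: int
    x: int

    def chain_sum(self) -> int:
        # The sum n + x*m + p + q named in the text.
        return self.n + self.x * self.m + self.p + self.q


def check_linker(idx: LinkerIndices, limit: int = 100) -> list:
    """Return a list of violated preferences (empty if none).

    Assumption: 'about 100' is treated as exactly 100; pass limit=20
    to test the more preferred range.
    """
    problems = []
    for name in ("n", "m", "p", "q", "x"):
        value = getattr(idx, name)
        if value < 0 or value > 100:
            problems.append(f"{name}={value} is outside 0..100")
    if idx.n == 0 and idx.m == 0 and idx.p == 0 and idx.q == 0:
        problems.append("preferably one of n, m, p or q is not 0")
    if idx.chain_sum() >= limit:
        problems.append(f"n+xm+p+q={idx.chain_sum()} is not less than {limit}")
    return problems
```

For instance, the hypothetical index set n=2, m=2, p=2, q=0, x=3 gives a sum of 10 and satisfies both the preferred and the more preferred limits.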

One or more of the CH2 groups of the linker can be optionally substituted with small
(C1-C6) alkyl, alkenyl, or alkoxy groups, an aryl group or can be substituted with functional
groups that promote ionization, such as acidic or basic groups or groups carrying permanent
positive or negative charge. One or more single bonds connecting CH2 groups in the linker
can be replaced with a double or a triple bond. Preferred R and R' alkyl, alkenyl, alkynyl or
alkoxy groups are small, having 1 to about 6 carbon atoms.

One or more of the atoms in the linker can be substituted with a stable isotope to
generate one or more substantially chemically identical, but isotopically distinguishable
reagents. For example, one or more hydrogens in the linker can be substituted with deuterium
to generate isotopically heavy reagents.
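
The mass difference that makes a heavy reagent distinguishable from its light counterpart follows directly from the number of hydrogens replaced. The sketch below uses the standard monoisotopic masses of hydrogen and deuterium; the function names are hypothetical, and the count of eight deuteriums in the usage note is an assumed value chosen for illustration, not a figure taken from this text.

```python
# Standard monoisotopic masses in daltons.
MASS_H = 1.00782503
MASS_D = 2.01410178


def heavy_light_mass_shift(n_deuterium: int, n_labels: int = 1) -> float:
    """Mass difference (Da) between heavy- and light-labeled peptides.

    n_deuterium: hydrogens replaced by deuterium in one linker.
    n_labels: number of reagent molecules attached to the peptide.
    """
    if n_deuterium < 0 or n_labels < 0:
        raise ValueError("counts must be non-negative")
    return n_labels * n_deuterium * (MASS_D - MASS_H)


def mz_spacing(n_deuterium: int, charge: int, n_labels: int = 1) -> float:
    """Spacing of the heavy/light pair on the m/z axis for a given charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return heavy_light_mass_shift(n_deuterium, n_labels) / charge
```

With an assumed eight deuteriums and a single label, the pair is separated by roughly 8.05 Da, which appears as about 4.03 m/z units for a doubly charged ion.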

In an exemplary embodiment the linker contains groups that can be cleaved to remove
the affinity tag. If a cleavable linker group is employed, it is typically cleaved after affinity
tagged peptides have been isolated using the affinity label together with the CR. In this case,
any isotopic labeling in the linker preferably remains bound to the protein or peptide.

Linker groups include among others: ethers, polyethers, ether diamines, polyether
diamines, diamines, amides, polyamides, polythioethers, disulfides, silyl ethers, alkyl or
alkenyl chains (straight chain or branched and portions of which may be cyclic), aryl, diaryl or
alkyl-aryl groups. Aryl groups in linkers can contain one or more heteroatoms (e.g., N, O or S
atoms).

In one aspect, the invention provides a gel electrophoresis mass spectrometric method
for identification and quantitation of one or more proteins in a complex mixture which
employs affinity labeled reagents in which the PRG is a group that selectively reacts with
certain amino acids or derivatives of amino acids that are typically found in proteins (e.g.,
sulfhydryl, amino, carboxy, homoserine lactone groups). Labeled reagents that optionally can
contain an affinity label and with different PRG groups are introduced into a mixture
containing proteins and the reagents react with certain proteins to tag them. In each case, it is
necessary either to obtain stoichiometric protein modification with the isotope labeled reagent,
or to modify the isotope labeled reagent so that the protein migrates homogeneously on the gel
system to be employed. It may be necessary to pretreat the protein mixture to reduce disulfide
bonds or otherwise facilitate labeling. After reaction with the labeled reagents, the multiple
samples are combined, preferably in equal amounts, and the proteins in the complex mixture
separated by either 1D or 2D gel electrophoresis. The gel is then stained to reveal the location
of the proteins. The area of the gel containing the protein mixture or mixtures of interest is
then excised and cleaved, e.g., enzymatically, into a number of peptides, or the gel is sliced
uniformly so that all pieces can be analyzed. Alternatively, the proteins may be electroblotted
to a membrane, and digestion performed on the membrane. As a third alternative, the proteins
may be continuously eluted from the bottom of the gel and collected as fractions, followed by
digestion. This digestion step may not be necessary, if the proteins are relatively small. After
the peptides are purified, the protein(s) may be identified by means of peptide mass
fingerprinting (PMF). When utilizing a reagent labeled with an affinity label, peptides that
remain tagged with the affinity label are then isolated by an affinity isolation step, e.g., affinity
chromatography, via their selective binding to the CR. Isolated peptides are released from the
CR by displacement of A or cleavage of the linker, and released materials are analyzed by
liquid chromatography/mass spectrometry (LC/MS). When a non-affinity labeled reagent is
utilized, this affinity isolation step is not effected. The sequence of one or more tagged
peptides is then determined by MSMS techniques, if necessary. In some cases, at least one
peptide sequence derived from a protein will be characteristic of that protein and be indicative
of its presence in the mixture. In other cases, the isotopically labeled peptide may be too short
to uniquely identify a protein, and the use of PMF data may be necessary to identify the
protein of origin. In other cases, the isotopically labeled peptides may be identical within a
family of closely related proteins, which can then be distinguished by PMF or by MSMS
analysis of other peptides present in the mixture that are unique to specific proteins. Finally,
the high resolving power of 2D gel electrophoresis makes it possible to distinguish between
different chemically modified forms of the same protein coding sequence, even if these
proteins overlap in space with other unrelated proteins. Thus, the sequences of the peptides
and the peptide mass fingerprint information together typically provide sufficient information
to identify one or more proteins present in a mixture, even if the sequence of the isotopically
labeled peptide is not sufficiently informative by itself.
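
The identification of a protein by peptide mass fingerprinting can be illustrated with a deliberately simplified Python sketch. It is not the database search software referred to in this text. It assumes trypsin-like cleavage (after K or R, except before P), standard monoisotopic residue masses, unmodified residues (the mass added to cysteine by a labeling reagent is ignored), and a hypothetical scoring rule that simply counts matched masses; the function names and the sequences in the usage note are likewise made up.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence: str) -> list:
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed: list, sequence: str, tol: float = 0.01) -> int:
    """Number of observed masses within tol (Da) of a theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_candidate(observed: list, database: dict, tol: float = 0.01) -> str:
    """Name of the database entry that explains the most observed masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))
```

Given the made-up database `{"protA": "AAKGGRPLK", "protB": "MSTK"}` and observed masses 288.18 and 626.386, both masses match the two tryptic peptides of `protA` and none match `protB`, so `best_candidate` returns `"protA"`.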

The relative amounts of proteins in one or more different samples containing protein
mixtures (e.g., biological fluids, cell or tissue lysates, etc.) can be determined using
chemically identical but differentially isotopically labeled reagents. These reagents may, but
need not, contain an affinity tag. In this method, each sample to be compared is treated with a
different isotopically labeled reagent to label certain proteins therein. Tagged peptides
originating from different samples are distinguished from one another by their mass, even
though they have the same chemical composition. Peptides characteristic of their protein
origin are identified using MS or MSⁿ techniques allowing identification of proteins in the
samples. The relative amount of a given protein in each sample is determined by comparing
relative abundance of the ions generated from any differentially labeled peptides originating
from that protein. The method can be used to assess simultaneously the relative amounts of
known proteins that originated in different samples. Further, since the method does not
require any prior knowledge of the type of proteins that may be present in the samples, it can
be used to identify proteins which are present at different levels in the samples examined.
More specifically, the method can be applied to screen for and identify proteins which exhibit
differential expression in cells, tissue or biological fluids. It is also possible to determine the
absolute amounts of specific proteins in a complex mixture. In this case, a known amount of
internal standard, one for each specific protein in the mixture to be quantified, is added to the
sample to be analyzed. The internal standard is a peptide that is identical in chemical
structure to the labeled peptide to be quantified except that the internal standard is
differentially isotopically labeled relative to the peptide to be quantified. The internal standard can
be provided in the sample to be analyzed in other ways. For example, a specific protein or set
of proteins can be chemically tagged with an isotopically labeled reagent. A known amount of
this material can be added to the sample to be analyzed. Also, it is possible to quantify the
levels of specific proteins in multiple samples in a single analysis (multiplexing). In this case,
the labeled peptides from different samples, derivatized with different affinity tagging
reagents, can be selectively quantified by mass spectrometry.
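
The arithmetic behind relative and absolute quantitation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, ion intensities are taken as already extracted from the spectrum, and the use of the median to combine several peptide pairs from one protein is a choice made here, not something the text prescribes.

```python
from statistics import median


def pair_ratio(intensity_a: float, intensity_b: float) -> float:
    """Abundance ratio (sample A / sample B) for one differentially labeled pair."""
    if intensity_b <= 0:
        raise ValueError("reference intensity must be positive")
    return intensity_a / intensity_b


def protein_ratio(pairs: list) -> float:
    """Combine the ratios of all labeled peptide pairs assigned to one protein.

    pairs: list of (intensity_a, intensity_b) tuples.
    Assumption: the median is used as a simple, outlier-tolerant summary.
    """
    return median(pair_ratio(a, b) for a, b in pairs)


def absolute_amount(analyte_intensity: float,
                    standard_intensity: float,
                    standard_amount: float) -> float:
    """Amount of analyte from a known amount of isotopic internal standard.

    Assumes analyte and standard respond identically, which is the premise
    of using a chemically identical, differentially labeled standard.
    """
    return standard_amount * pair_ratio(analyte_intensity, standard_intensity)
```

For example, three peptide pairs with intensities (200, 100), (90, 30) and (50, 25) give ratios of 2, 3 and 2, and a combined protein ratio of 2. An analyte peak three times as intense as that of 5 units of internal standard corresponds to 15 units of analyte.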

The method of the present invention provides for quantitative measurement of specific
proteins in biological fluids, cells or tissues and can be applied to determine global protein
expression profiles in different cells and tissues. The same general strategy can be broadened
to achieve the proteome-wide, qualitative and quantitative analysis of the state of modification
of proteins, by employing labeled reagents with differing specificity for reaction with
modified amino acid residues. The method of this invention can be used to identify low
abundance proteins in complex mixtures and can be used to selectively analyze specific
groups or classes of proteins such as membrane or cell surface proteins, or proteins contained
within organelles, sub-cellular fractions, or biochemical fractions such as immunoprecipitates.
Further, these methods can be applied to analyze differences in expressed proteins in different
cell states. For example, the methods herein can be employed in diagnostic assays for the
detection of the presence or the absence of one or more proteins indicative of a disease state,
such as cancer.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is an image of a 2D-gel onto which five different standard proteins
had been loaded, with insets of mass spectra showing the regions that contained ICAT™
reagent pairs in accordance with the present invention. Also listed is the ratio at which the
proteins were mixed prior to electrophoresis, and the ratio that was obtained upon
measurement of the intensities of the ICAT reagent pairs.

Figure 2 is an expanded view of the spot for lactalbumin, segmented into
quadrants. Also shown are the regions of a mass spectrum containing one ICAT reagent pair,
and the intensity ratio that was determined for each of them in accordance with the present
invention.

Figure 3 is a set of mass spectra obtained from one fraction of a mixture of two
lysates of E. coli that had been labeled separately with ICAT reagent prior to electrophoresis
through a flow-through gel apparatus in accordance with the present invention. The first panel
shows the entire peptide mass fingerprint that was obtained for one particular fraction after
digestion with trypsin, and the second panel shows the peptides that were retained and eluted
from avidin beads for this fraction. Two ICAT reagent pairs are shown in the insets.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

One aspect of this invention employs affinity tagged protein reactive reagents in which
the affinity tag is covalently attached to a protein reactive group by a linker, or a reagent free of
an affinity tag which comprises a protein reactive group covalently attached to a linker.
The linker is isotopically labeled to generate pairs or sets of reagents that are substantially
chemically identical, but which are distinguishable by mass. For example, a pair of reagents,
one of which is isotopically heavy and the other of which is isotopically light, can be employed
for the comparison of two samples, one of which may be a reference sample containing one or
more known proteins in known amounts. For example, any one or more of the hydrogen,
nitrogen, oxygen or sulfur atoms in the linker may be replaced with their isotopically stable
isotopes 2H, 13C, 15N, 17O, 18O or 34S.
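
As an illustration of how isotopic substitution makes two otherwise identical reagents distinguishable by mass, the following sketch computes the mass difference produced by a given set of substitutions. The isotope masses are rounded standard reference values, and the table and function names are hypothetical; the example of eight hydrogen-to-deuterium substitutions is chosen only as one possible case.

```python
# Monoisotopic masses in daltons (rounded standard reference values).
ISOTOPE_MASS = {
    "1H": 1.007825, "2H": 2.014102,
    "12C": 12.000000, "13C": 13.003355,
    "14N": 14.003074, "15N": 15.000109,
    "16O": 15.994915, "18O": 17.999160,
}


def mass_shift(substitutions):
    """Mass difference (heavy minus light) for a list of
    (light_isotope, heavy_isotope, number_of_atoms) substitutions."""
    return sum(
        count * (ISOTOPE_MASS[heavy] - ISOTOPE_MASS[light])
        for light, heavy, count in substitutions
    )


# Eight hydrogens in the linker replaced by deuterium.
d8_shift = mass_shift([("1H", "2H", 8)])
print(round(d8_shift, 4))
```

Mixed substitutions, for example several 13C atoms together with 15N atoms, can be passed as additional tuples in the same list.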

When utilized, suitable affinity tags bind selectively, either covalently or non-covalently,
and with high affinity to a capture reagent (CR). The CR-A interaction or bond
should remain intact after extensive and multiple washings with a variety of solutions to
remove non-specifically bound components. The affinity tag binds minimally or preferably
not at all to components in the assay system, except CR, and does not significantly bind to
surfaces of reaction vessels. Any non-specific interaction of the affinity tag with other
components or surfaces should be disrupted by multiple washes that leave CR-A intact.
Further, it must be possible to disrupt the interaction of A and CR to release peptides,
substrates or reaction products, for example, by addition of a displacing ligand or by changing
the temperature or solvent conditions. Preferably, neither CR nor A reacts chemically with
other components in the assay system and both groups should be chemically stable over the
time period of an assay or experiment. The affinity tag preferably does not undergo peptide-like
fragmentation during (MS)n analysis. The affinity label is preferably soluble in the
sample liquid to be analyzed and the CR should remain soluble in the sample liquid even
though attached to an insoluble resin such as Agarose. In the case of CR, the term soluble
means that CR is sufficiently hydrated or otherwise solvated such that it functions properly for
binding to A. CR or CR-containing conjugates should not be present in the sample to be
analyzed, except when added to capture A.
Examples of A and CR pairs include: 

biotin or structurally modified biotin-based reagents, including iminobiotin, which
bind to proteins of the avidin/streptavidin family, which may, for example, be used in the forms of
streptavidin-Agarose, oligomeric-avidin-Agarose, or monomeric-avidin-Agarose;

any 1,2-diol, such as 1,2-dihydroxyethane (HO-CH2-CH2-OH), and other
1,2-dihydroxyalkanes including those of cyclic alkanes, e.g., 1,2-dihydroxycyclohexane, which bind
to an alkyl or aryl boronic acid or boronic acid esters, such as phenyl-B(OH)2 or
hexyl-B(OEthyl)2, which may be attached via the alkyl or aryl group to a solid support material, such as
Agarose;

maltose which binds to maltose binding protein (as well as any other sugar/sugar
binding protein pair or more generally any ligand/ligand binding protein pair that has the
properties discussed above);

a hapten, such as a dinitrophenyl group, for any antibody where the hapten binds to an
anti-hapten antibody that recognizes the hapten; for example, the dinitrophenyl group will bind
to an anti-dinitrophenyl-IgG;

a ligand which binds to a transition metal, for example, an oligomeric histidine will
bind to Ni(II); the transition metal CR may be used in the form of a resin bound chelated
transition metal, such as nitrilotriacetic acid-chelated Ni(II) or iminodiacetic acid-chelated
Ni(II);

glutathione which binds to glutathione-S-transferase. 

In general, any A-CR pair commonly used for affinity enrichment which meets the
suitability criteria discussed above can be used. Biotin and biotin-based affinity tags are
preferred. Of particular interest are structurally modified biotins, such as iminobiotin, which
will elute from avidin or streptavidin columns under solvent conditions compatible with ESI-MS
analysis, such as dilute acids containing 10-20% organic solvent. It is expected that
iminobiotin tagged compounds will elute in solvents below pH 4. Iminobiotin tagged protein
reactive reagents can be synthesized by methods described herein for the corresponding biotin
tagged reagents. In one preferred embodiment, the affinity enrichment medium consists of
monomeric avidin, which has a lower affinity for biotin than tetrameric avidin, and therefore
can be recycled and used for the purification of peptides from many fractions.

A displacement ligand, DL, is optionally used to displace A from CR. Suitable DLs
are not typically present in samples unless added. DL should be chemically and enzymatically
stable in the sample to be analyzed and should not react with or bind to components (other
than CR) in samples or bind non-specifically to reaction vessel walls. DL preferably does not
undergo peptide-like fragmentation during MS analysis, and its presence in the sample should not
significantly suppress the ionization of tagged peptide, substrate or reaction product
conjugates. DL itself preferably is minimally ionized during mass spectrometric analysis and
the formation of ions composed of DL clusters is preferably minimal. The selection of DL
depends upon the A and CR groups that are employed. In general, DL is selected to displace
A from CR in a reasonable time scale, at most within a week of its addition, but more
preferably within a few minutes or up to an hour. The affinity of DL for CR should be
comparable to or stronger than the affinity of the tagged compounds containing A for CR.
Furthermore, DL should be soluble in the solvent used during the elution of tagged
compounds containing A from CR. DL preferably is free A or a derivative or structural
modification of A. Examples of DL include biotin or biotin derivatives, particularly those
containing groups that suppress cluster formation or suppress ionization in MS.

The linker group (L) should be soluble in the sample liquid to be analyzed and it
should be stable with respect to chemical reaction, e.g., substantially chemically inert, with
components of the sample as well as A and CR groups. The linker when bound to A should
not interfere with the specific interaction of A with CR or interfere with the displacement of A
from CR by a displacing ligand or by a change in temperature or solvent. The linker should
bind minimally or preferably not at all to other components in the system, to reaction vessel
surfaces or CR. Any non-specific interactions of the linker should be broken after multiple
washes which leave the A-CR complex intact. Linkers preferably do not undergo peptide-like
fragmentation during (MS)n analysis. At least some of the atoms in the linker groups should
be readily replaceable with stable heavy-atom isotopes. The linker preferably contains groups
or moieties that facilitate ionization of the affinity tagged reagents, peptides, substrates or
reaction products.

To promote ionization, the linker may contain acidic or basic groups, e.g., COOH,
SO3H, primary, secondary or tertiary amino groups, nitrogen-heterocycles, ethers, or
combinations of these groups. The linker may also contain groups having a permanent
charge, e.g., phosphonium groups, quaternary ammonium groups, sulfonium groups, chelated
metal ions, tetraalkyl or tetraaryl borate or stable carbanions.

The covalent bond of the linker to A or PRG should typically not be unintentionally
cleaved by chemical or enzymatic reactions during the assay. In some cases it may be
desirable to cleave the linker from the affinity tag A or from the PRG, for example to facilitate
release from an affinity column. Thus, the linker can be cleavable, for example, by chemical,
thermal, enzymatic or photochemical reaction. Photocleavable groups in the linker may
include the 1-(2-nitrophenyl)-ethyl group. Thermally labile linkers may, for example, be a
double-stranded duplex formed from two complementary strands of nucleic acid, a strand of a
nucleic acid with a complementary strand of a peptide nucleic acid, or two complementary
peptide nucleic acid strands which will dissociate upon heating. Cleavable linkers also
include those having disulfide bonds, acid or base labile groups, including among others,
diarylmethyl or trimethylarylmethyl groups, silyl ethers, carbamates, oxyesters, thioesters,
thionoesters, and alpha-fluorinated amides and esters. Enzymatically cleavable linkers can
contain, for example, protease-sensitive amides or esters, β-lactamase-sensitive β-lactam
analogs and linkers that are nuclease-cleavable, or glycosidase-cleavable.

The protein reactive group (PRG) can be a group that selectively reacts with certain
protein functional groups. Any selectively reactive protein reactive group should react with a
functional group of interest that is present in at least a portion of the proteins in a sample.
Reaction of PRG with functional groups on the protein should occur under conditions that do
not lead to substantial degradation of the compounds in the sample to be analyzed. Examples
of selectively reactive PRGs suitable for use in the affinity tagged reagents of this invention
include those which react with sulfhydryl groups to tag proteins containing cysteine, those that
react with amino groups, carboxylate groups, ester groups, phosphate reactive groups, and
aldehyde and/or ketone reactive groups or, after fragmentation with CNBr, with homoserine
lactone.

Thiol reactive groups include epoxides, alpha-haloacyl groups, nitriles, sulfonated alkyl
or aryl thiols and maleimides. Amino reactive groups tag amino groups in proteins and
include sulfonyl halides, isocyanates, isothiocyanates, active esters, including
tetrafluorophenyl esters, and N-hydroxysuccinimidyl esters, acid halides, and acid anhydrides.
In addition, amino reactive groups include aldehydes or ketones in the presence or absence of
NaBH4 or NaCNBH3.

Carboxylic acid reactive groups include amines or alcohols in the presence of a
coupling agent such as dicyclohexylcarbodiimide, or 2,3,5,6-tetrafluorophenyl trifluoroacetate
and in the presence or absence of a coupling catalyst such as 4-dimethylaminopyridine; and
transition metal-diamine complexes including Cu(II) phenanthroline.

Ester reactive groups include amines which, for example, react with homoserine
lactone.

Phosphate reactive groups include chelated metal where the metal is, for example 
Fe(III) or Ga(III), chelated to, for example, nitrilotriacetic acid or iminodiacetic acid. 

Aldehyde or ketone reactive groups include amine plus NaBH4 or NaCNBH3, or these
reagents after first treating a carbohydrate with periodate to generate an aldehyde or ketone.

The requirements discussed above for A, L, and PRG extend to the corresponding
segments of A-L-PRG and to the reaction products generated with this reagent.

Internal standards, which are appropriately isotopically labeled, may be employed in
the methods of this invention to measure absolute quantitative amounts of proteins in samples.
These may be prepared by reaction of affinity labeled protein reactive reagents with a
preparation known to contain the protein of interest to generate the affinity tagged peptides
generated from digestion of the tagged protein. Alternatively, the desired peptides may be
chemically synthesized. Affinity tagged peptide internal standards are substantially chemically
identical to the corresponding affinity tagged peptides generated from digestion of the affinity
tagged protein, except that they are differentially isotopically labeled to allow their
independent detection by MS techniques.

The method of this invention can also be applied to determine the relative quantities of
one or more proteins in two or more protein samples, while simultaneously determining their
identity. The proteins in each sample are reacted with the labeled reagents which are
substantially chemically identical but differentially isotopically labeled. The samples are
combined and processed as one, and then run together by gel electrophoresis. The proteins
contained in specific bands or spots are then digested. Alternatively, after mixing the protein
samples, but prior to electrophoresis, the proteins may be subjected to avidin affinity
chromatography to enrich for biotinylated proteins, which could be important, for example, if
intact cells had been labeled. The relative quantity of each labeled peptide, which reflects the
relative quantity of the protein from which the peptide originates, is determined by the
measurement of the respective isotope peaks by mass spectrometry.

The methods of this invention can be applied to the analysis or comparison of multiple
different samples. Samples that can be analyzed by methods of this invention include cell
homogenates; cell fractions; biological fluids including urine, blood, and cerebrospinal fluid;
tissue homogenates; tears; feces; saliva; lavage fluids such as lung or peritoneal lavages;
mixtures of biological molecules including proteins, lipids, carbohydrates and nucleic acids
generated by partial or complete fractionation of cell or tissue homogenates.

The methods of this invention employ MS and (MS)n methods. While a variety of MS
and (MS)n methods are available and may be used in these methods, Matrix Assisted Laser Desorption
Ionization MS (MALDI/MS) and Electrospray Ionization MS (ESI/MS) methods are
preferred.

As set forth above, the proteins in each sample are labeled with either an affinity (A)
labeled or a non-affinity labeled reagent, both of which include a labeled linker moiety (L) and
a protein reactive group (PRG).

The labeled samples are mixed and then preferably subjected to 2D PAGE. Instead of
2D PAGE, one dimensional SDS electrophoresis, one dimensional isoelectric focusing gels,
or any other electrophoretic method for separating proteins, including native protein
electrophoresis, can be used. The procedures for running one dimensional and two
dimensional electrophoresis are well known to those skilled in the art.

Proteins that the two cell samples have in common form coincident spots upon protein
staining, or upon direct MS analysis of a piece of the gel. The ratio of the detectable isotopes
between identical proteins from either sample will be constant for the vast majority of
proteins. Proteins that the two samples do not have in common will migrate independently.
Thus, a protein that is unique to one sample, or that is present at a different relative
concentration in one sample, will have a different ratio of detectable isotopes from the
majority of protein spots. The protein spots of
interest then are digested to form labeled peptides which then are analyzed by (MS)n.
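
The selection of spots of interest described above can be sketched as a comparison of each spot's isotope ratio against the ratio shared by the majority of spots. This is a minimal sketch under stated assumptions: the median is used as the estimate of the majority ratio, the two-fold cutoff is arbitrary, and the spot names and ratio values are hypothetical.

```python
from statistics import median


def flag_changed_spots(ratios, fold=2.0):
    """Return spots whose heavy/light ratio, normalized to the median
    ratio of all spots, differs by at least `fold` in either direction.

    ratios: dict mapping a spot identifier to its heavy/light ratio.
    """
    center = median(ratios.values())
    flagged = {}
    for spot, ratio in ratios.items():
        normalized = ratio / center
        if normalized >= fold or normalized <= 1.0 / fold:
            flagged[spot] = normalized
    return flagged


# Hypothetical ratios measured for five spots on a single gel.
ratios = {"spot_1": 1.02, "spot_2": 0.97, "spot_3": 3.10,
          "spot_4": 1.00, "spot_5": 0.31}
changed = flag_changed_spots(ratios)
print(sorted(changed))
```

Normalizing to the median also compensates for the case in which the two samples were not mixed in exactly equal amounts.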

In conventional analysis, a control is run with known proteins for the cell type being
studied. The known spots on the sample gel have to be identified and marked, then compared
to the control and the second gel to determine differences between the two gels. In the present
invention, there is only one gel so no marking is necessary. In addition, the software used in
conventional processes for alignment of different gels prior to comparing and contrasting
protein differences does not correct for local distortions and inconsistencies between two or
more gels. The process of the present invention eliminates the need for such correction
because the extracts for all samples to be tested are mixed and run on the same gel. Any gel
distortions are experienced equally by each sample.

One of the advantages of performing gel electrophoresis is that proteins of particular
interest migrate to a reproducible place on the gel, so that, if desired, only these proteins need
be analyzed. These proteins can include disease markers as well as control proteins. Many of
the post-translationally modified forms of these proteins can be separated from one another by
gel electrophoresis, so that the methods of the invention could be used to determine and
quantify changes in the expression of each of these modified forms. If there was any difficulty
in localizing such proteins, a small portion of the separated samples could be transblotted
from the gel and these proteins could be located by immunoblotting techniques.
Alternatively, a small amount of the protein of interest could be labeled with a fluorescent
marker known not to affect migration position prior to electrophoresis to identify the regions
of interest to be analyzed. Then the methods of this invention could be used to measure the
quantitative changes in the majority of the proteins in the gel based upon the PRG as a
function of their migration on the gel.

The method of this invention can be utilized to analyze the protein composition
described in Published PCT application WO 00/11208 which is incorporated herein by
reference.

Quantitative Proteome Analysis with Affinity Labeled Reagent

This method consists of using a biotin labeled sulfhydryl-reactive reagent for
quantitative protein profile measurements in a sample protein mixture and a reference protein
mixture. The method comprises the following steps:

A. Reduction. Disulfide bonds of proteins in the sample and reference mixtures are
reduced to free SH groups. The preferred reducing agent is tri-n-butylphosphine, which is
used under standard conditions. Alternative reducing agents include
tricarboxyethylphosphine, mercaptoethylamine and dithiothreitol. If required, this reaction
can be performed in the presence of solubilizing agents including high concentrations of urea
and detergents to maintain protein solubility. The reference and sample protein mixtures to be
compared are processed separately, applying identical reaction conditions.

B. Derivatization of SH groups with an affinity tag. Free SH groups are derivatized
with the biotinylating reagent biotinyl-iodoacetylamidyl-4,7-dioxadecanediamine. The
reagent is prepared in different isotopically labeled forms by substitution of linker atoms with
stable isotopes and each sample is derivatized with a different isotopically labeled form of the
reagent. Derivatization of SH groups is preferably performed under slightly basic conditions
(pH 8.5) for 90 minutes at room temperature. For the quantitative, comparative analysis of
two samples, one sample each (termed reference sample and sample) is derivatized with the
isotopically light and the isotopically heavy form of the reagent, respectively. For the
comparative analysis of several samples, one sample is designated a reference to which the
other samples are related. Typically, the reference sample is labeled with the isotopically
heavy reagent and the experimental samples are labeled with the isotopically light form of the
reagent, although this choice of reagents is arbitrary. These reactions are also compatible with
the presence of high concentrations of solubilizing agents.

C. Combination of labeled samples. After completion of the affinity tagging reaction,
defined aliquots of the samples labeled with the isotopically different reagents (e.g., heavy and
light reagents) are combined and all the subsequent steps are performed on the pooled
samples. Combination of the differentially labeled samples at this early stage of the procedure
eliminates variability due to subsequent reactions and manipulations. Preferably equal
amounts of each sample are combined, and then fractionated by one of the following well
known techniques:

1.) Flow-through gel electrophoresis. The labeled proteins are separated through a
preparative flow-through SDS gel (5%) apparatus (Mini Prep Cell, Bio-Rad) and
the eluted proteins are collected in fractions. The proteins may be concentrated, for
example, by acetone precipitation before proteolytic digestion is effected by
overnight incubation with an enzyme such as trypsin.

2.) Standard gel electrophoresis. The gel may be stained for proteins to localize spots
or bands, or the spots or slices may be processed without protein detection at this
stage. Protein mixtures that are present in a spot (2D) or band (1D) by gel
electrophoresis are excised from the gel, optionally dried and digested with an
enzyme. The proteins in the sample mixture are digested, typically with trypsin.
Alternative proteases are also compatible with the procedure, as in fact are
chemical fragmentation procedures. This step may be omitted in the analysis of
small proteins.

3.) Standard gel electrophoresis with digestion and transblotting for peptide extraction.
The gel may be treated with enzymes and transblotted (with or without the aid of
electric current) onto a membrane, or transblotted through an active protease
membrane and captured on a second membrane (Bienvenut et al., Anal. Chem.
71:4800-4807, 1999). That membrane can then be directly analyzed by MS or
MALDI MSMS.

D. Peptide Mass Fingerprinting. The protein digest may then be submitted to PMF to
identify the major protein components. In favorable instances, the Cys-containing
biotinylated peptides are detectable at this stage as isotope pairs that are 8 amu apart, and the
relative amount of the proteins can be determined by comparing the intensities of these
peptides in the mass spectrum without additional purification.
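
The search for isotope pairs in a peptide mass fingerprint can be sketched as below. The sketch assumes singly charged ions, as is typical of MALDI spectra, and a spacing of about 8.05 Da for eight deuterium atoms; the function name, the tolerance, and the peak list are hypothetical illustrations rather than measured data.

```python
def find_isotope_pairs(peaks, delta=8.0502, tol=0.05):
    """Find peak pairs separated by the reagent mass differential.

    peaks: list of (m/z, intensity) tuples for singly charged ions.
    Returns a list of (light_mz, heavy_mz, light_to_heavy_ratio).
    """
    peaks = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(peaks):
        for mz_heavy, int_heavy in peaks[i + 1:]:
            gap = mz_heavy - mz_light
            if gap > delta + tol:
                break  # peaks are sorted, so later peaks are even farther away
            if abs(gap - delta) <= tol and int_heavy > 0:
                pairs.append((mz_light, mz_heavy, int_light / int_heavy))
    return pairs


# Hypothetical peak list containing one light/heavy pair.
peaks = [(1200.60, 900.0), (1450.72, 400.0), (1458.77, 800.0), (1675.90, 150.0)]
pairs = find_isotope_pairs(peaks)
print(pairs)
```

Peptides containing more than one labeled cysteine would appear at multiples of the spacing, so the same function can be called again with a correspondingly larger delta.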

E. Affinity isolation of the affinity tagged peptides by interaction with a capture
reagent. The biotinylated peptides may then be isolated on avidin-agarose. After digestion
the pH of the peptide samples is lowered to 6.5 and the biotinylated peptides are immobilized
on beads coated with monomeric avidin (Promega). The beads are extensively washed. The
last washing solvent includes 10% acetonitrile to remove residual SDS. Biotinylated peptides
are eluted from avidin-agarose, for example, with 0.4% trifluoroacetic acid in the presence of
acetonitrile.

Analysis of the isolated, derivatized peptides may also be accomplished by μLC-MSn
or CE-MSn with data dependent fragmentation. Methods and instrument control protocols
well-known in the art and described, for example, in Ducret et al., 1998, Prot. Sci. 7: 706-719;
Figeys and Aebersold, 1998, Electrophoresis 19: 885-892; Figeys et al., 1996, Nature Biotech.
14: 1579-1583; or Haynes et al., 1998, Electrophoresis 19: 939-945, which are
incorporated herein by reference, are used. In this last step, both the quantity and sequence identity of
the proteins from which the tagged peptides originated can be determined by automated
multistage MS. This is achieved by the operation of the mass spectrometer in a dual mode in
which the instrument alternates in successive scans between measuring the relative quantities
of peptides eluting from the capillary column and recording the sequence information of
selected peptides. Peptides are quantified by measuring in the MS mode the relative signal
intensities for pairs of peptide ions of identical sequence that are tagged with the isotopically
light or heavy forms of the reagent, respectively, and which, therefore, differ in mass by the
mass differential encoded within the affinity tagged reagent. Peptide sequence information is
automatically generated by selecting peptide ions of a particular mass-to-charge (m/z) ratio for
collision-induced dissociation (CID) in the mass spectrometer operating in the MSn mode.
See Link, A.J. et al. Electrophoresis 18:1314-1334, 1997; Gygi, S.P. et al. Mol. Cell. Biol.
19:1720-1730, 1999; and Gygi, S.P. et al. Electrophoresis 20:310-319, 1999, which are
incorporated herein by reference. The resulting CID spectra are then automatically correlated
with sequence databases to identify the protein from which the sequenced peptide originated.
The combination of the results generated by MS and MSMS analyses of affinity tagged and
differentially labeled peptide samples determines the relative quantities as well as the
sequence identities of the components of protein mixtures in a single, automated operation.
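
When peptide ions carry more than one charge, as is common with electrospray ionization, the light and heavy members of a pair are separated on the m/z axis by the mass differential divided by the charge. The following sketch illustrates this arithmetic; the function name is hypothetical and the 8.0502 Da differential assumes a reagent pair differing by eight deuterium atoms.

```python
def pair_spacing_mz(delta_mass, charge, n_tags=1):
    """Expected m/z separation between the light and heavy forms of a
    peptide ion carrying `n_tags` reagent molecules at the given charge."""
    if charge < 1 or n_tags < 1:
        raise ValueError("charge and n_tags must be positive integers")
    return n_tags * delta_mass / charge


# One tag on a doubly charged peptide, and two tags on a triply charged one.
print(round(pair_spacing_mz(8.0502, charge=2), 4))
print(round(pair_spacing_mz(8.0502, charge=3, n_tags=2), 4))
```

A pair-matching routine operating in the MS mode would therefore test candidate spacings for each plausible charge state rather than a single fixed value.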

This method can also be practiced using other affinity tags and other protein reactive
groups, including amino reactive groups, carboxyl reactive groups, or groups that react with
homoserine lactones.

The approach employed herein for quantitative proteome analysis is based on two
principles. First, a short sequence of contiguous amino acids from a protein (5-25 residues)
contains sufficient information to uniquely identify that protein. Protein identification by MSn
is accomplished by correlating the sequence information contained in the CID mass spectrum
with sequence databases, using sophisticated computer searching algorithms (Eng, J. et al. J.
Amer. Soc. Mass Spectrom. 5: 976-989, 1994; Mann, M. et al. Anal. Chem. 66: 4390-4399,
1994; Qin, J. et al. Anal. Chem. 69: 3995-4001, 1997; Clauser, K.R. et al. Proc. Natl. Acad.
Sci. USA 92: 5072-5076, 1995, which are incorporated herein by reference). Second, pairs of
identical peptides tagged with the light and heavy affinity tagged reagents, respectively, (or in
analysis of more than two samples, sets of identical tagged peptides in which each set member
is differentially isotopically labeled) are chemically identical and therefore serve as mutual
internal standards for accurate quantitation. The MS measurement readily differentiates
between peptides originating from different samples, representing for example different cell
states, because of the difference between isotopically distinct reagents attached to the
peptides. The ratios between the intensities of the differing weight components of these pairs
or sets of peaks provide an accurate measure of the relative abundance of the peptides (and
hence the proteins) in the original cell pools because the MS intensity response to a given
peptide is independent of the isotopic composition of the reagents (De Leenheer, A. P. et al.,
Mass Spectrom. Rev. 11:249-702, 1992, which is incorporated herein by reference). The use
of isotopically labeled internal standards is standard practice in quantitative mass spectrometry
and has been exploited to great advantage in, for example, the precise quantitation of drugs
and metabolites in bodily fluids.
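
Because each light/heavy peptide pair gives an independent estimate of the relative abundance of its parent protein, several pairs from the same protein can be combined into one value. The sketch below takes the median of the peptide ratios; the choice of the median is an assumption made here for robustness against a single aberrant pair and is not a procedure prescribed in this description. The function name and intensities are hypothetical.

```python
from statistics import median


def protein_ratio(peptide_pairs):
    """Combine light/heavy intensity pairs from peptides of one protein
    into a single light-to-heavy abundance ratio (median of pair ratios).

    peptide_pairs: list of (light_intensity, heavy_intensity) tuples.
    """
    ratios = [light / heavy for light, heavy in peptide_pairs
              if light > 0 and heavy > 0]
    if not ratios:
        raise ValueError("no usable peptide pairs")
    return median(ratios)


# Three hypothetical peptide pairs assigned to the same protein.
ratio = protein_ratio([(100.0, 50.0), (220.0, 100.0), (90.0, 50.0)])
print(ratio)
```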

The methods of this invention, in particular 1D gels, can be applied to analysis of
classes of proteins with particular physical-chemical properties including poor solubility, large
or small size and extreme pI values. Low abundance proteins can be analyzed by performing
protein affinity subtraction prior to electrophoresis to remove the most abundant proteins.
Alternatively, the biotinylation reaction could be performed in such a way as to label a minor
subset of proteins, for example, those proteins exposed on the outside of a cell, or proteins that
remain exposed after organelle purification. Because a large amount of non-biotinylated
protein would then be present that would otherwise interfere with electrophoresis, after
mixing the proteins from the control and experimental samples together, the protein preparation could
be subjected to avidin affinity chromatography to enrich for the biotinylated proteins, which
would then be electrophoresed.

The prototypical application of the chemistry and method of the present invention is
the establishment of quantitative profiles of complex protein samples and ultimately total
lysates of cells and tissues following the preferred method described above. In addition the
reagents and methods of this invention have applications which go beyond the determination
of protein expression profiles. Such applications include the following:

Amino-reactive or sulfhydryl-reactive, differentially isotopically
labeled affinity tagged reagents can be used for the quantitative analysis of proteins in
immunoprecipitated complexes. In the preferred version of this technique protein complexes
from cells representing different states (e.g., different states of activation, different disease
states, different states of differentiation) are precipitated with a specific reagent, preferably an
antibody. The proteins in the precipitated complex are then derivatized and analyzed as
above.

Amino-reactive, differentially isotopically labeled affinity tagged
reagents can be used to determine the sites of induced protein phosphorylation. In a preferred
version of this method purified proteins (e.g., immunoprecipitated from cells under different
stimulatory conditions) are fragmented and derivatized as described above. Phosphopeptides
are identified in the resulting peptide mixture by fragmentation in the ion source of the ESI-MS
instrument and their relative abundances are determined by comparing the ion signal
intensities of the experimental sample with the intensity of an included, isotopically labeled
standard.

Amino-reactive, differentially isotopically labeled affinity tagged reagents are used to 
identify the N-terminal ion series in MSMS spectra. In a preferred version of this application, 
the peptides to be analyzed are derivatized with a 50:50 mixture of an isotopically light and 
heavy reagent which is specific for amino groups. Fragmentation of the peptides by CID 
therefore produces two N-terminal ion series which differ in mass precisely by the mass 
differential of the reagent species used. This application dramatically reduces the difficulty in 
determining the amino acid sequence of the derivatized peptide. 
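
By way of illustration only, the doublet search implied by this application can be sketched 
as follows. The function name find_label_doublets, the tolerance, and the peak list are 
hypothetical; the sketch assumes singly charged fragment ions and a known mass differential 
between the heavy and light reagent. 

```python
from bisect import bisect_left

def find_label_doublets(peaks, delta, tol=0.02):
    """Return (light, heavy) m/z pairs whose spacing equals delta within tol.

    peaks: m/z values of singly charged fragment ions (any order).
    delta: mass difference between the heavy and light reagent (assumed known).
    """
    mz = sorted(peaks)
    pairs = []
    for light in mz:
        target = light + delta
        i = bisect_left(mz, target - tol)
        while i < len(mz) and mz[i] <= target + tol:
            pairs.append((light, mz[i]))
            i += 1
    return pairs

# Hypothetical fragment masses: three N-terminal ions appear as doublets,
# while the remaining (unlabeled C-terminal) ions appear as singlets.
peaks = [300.10, 308.15, 413.18, 421.23, 500.30, 526.27, 534.32, 650.40]
doublets = find_label_doublets(peaks, delta=8.05)
for light, heavy in doublets:
    print(f"N-terminal ion candidate: {light:.2f} / {heavy:.2f}")
```

Peaks that appear as doublets are candidates for the N-terminal series; singlets are 
candidates for the C-terminal series. 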

The following examples illustrate four different experiments in which gel 
electrophoresis separations were performed and quantitative data were obtained using ICAT™ 
reagents that contained a biotinyl affinity tag, a linker with eight deuterium atoms, and an 
iodoacetamide protein reactive group. These examples are not exhaustive and are not 
intended to limit the scope of these experiments. 

EXAMPLES 

Example 1 

Five different standard proteins were alkylated separately with the d0 ICAT reagent 
and the d8 ICAT reagent, and mixed together in different ratios prior to performing 2D gel 
electrophoresis. After staining, the spots corresponding to these proteins were cut out, 
digested with trypsin, and submitted to PMF. Figure 1 shows an image of the gel and insets of 
each mass spectrum that contain one of the ICAT reagent pairs from each protein. In addition, 
the ratio at which the proteins were mixed together prior to gel electrophoresis is listed, as 
well as the ratio of d0 to d8 that was obtained by mass spectrometry. In all five cases, the 
discrepancy between the experimental and the observed ratios was well below 20%. 
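
As a minimal sketch of how such a discrepancy might be computed, the following assumes that 
the discrepancy is expressed relative to the mixing ratio. The function name and the numeric 
values are hypothetical and are not data from Figure 1. 

```python
def percent_discrepancy(expected_ratio, observed_ratio):
    """Relative deviation of the observed d0/d8 ratio from the mixing ratio, in percent."""
    return abs(observed_ratio - expected_ratio) / expected_ratio * 100.0

# Hypothetical (mixing ratio, measured ratio) values; not data from Figure 1.
pairs = {"protein A": (1.0, 1.08), "protein B": (2.0, 1.80), "protein C": (0.5, 0.55)}
for name, (expected, observed) in pairs.items():
    print(f"{name}: {percent_discrepancy(expected, observed):.1f}%")
```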

One of the problematic aspects of separating ICAT reagent labeled peptides by HPLC 
is that the d8 labeled peptide typically elutes several seconds ahead of the corresponding d0 
labeled peptide. To demonstrate the fact that upon gel electrophoresis there is no similar 
isotope separation effect, the 2D spot for lactalbumin, shown in Figure 2, was split into 
quadrants, which were then separately digested, extracted, and submitted to MALDI MS 
analysis. The right hand side of Figure 2 demonstrates that the same ratio of d0 to d8 was 
obtained for each of these quadrants, within 10%. 

Example 2 

E. coli bacteria lysates, either labeled with an ICAT reagent comprising deuterated 
biotinyl iodoacetamide reagent for minimal medium (glucose) growing condition or labeled 
with non-deuterated reagent for rich medium (LB broth) growing condition, were mixed in 
equal amounts. The mixture was separated through a preparative flow-through SDS gel (5%) 
apparatus (Mini Prep Cell, Bio-Rad) and proteins were fractionated into solution. The 
fractionated proteins were then acetone precipitated before proteolytic digestion by overnight 
incubation with trypsin. Upon avidin chromatography, peptides from both the flow-through 
portion and the elution portion were collected into 96 fractions. The flow-through was 
captured on reversed phase medium (POROS® 50R1, Applied Biosystems) and washed with 
distilled water and eluted with 60% ACN. Samples were vacuum dried and re-suspended with 
50% ACN/0.1% TFA. Spectra were acquired using an Applied Biosystems Voyager MALDI 
TOF mass spectrometer with α-cyano-4-hydroxycinnamic acid as matrix. The strategy was to 
identify proteins using PMF, while the d0/d8 ratio was used for quantitation. 

Figure 3 shows the spectrum acquired for the avidin flow-through and for the peptides 
eluted from the avidin for one fraction that contained proteins at about 40,000 in molecular 
weight. Ten (10) different ICAT reagent labeled pairs are marked. The major protein 
components were tentatively identified by PMF using the ChemApplex PMF software 
program (Applied Biosystems), and six components are listed in Table 1 below. OmpA was 
the main component, which comprised 25% of the total intensity. The confidence in the 
identification is roughly proportional to the score listed in column 5. Note that all six of these 
proteins have molecular weights that are between 30K and 52K daltons, as would be expected 
using crude SDS separation. A special peptide database was created containing cysteine 
peptides only, and the masses from the eluted peptides were searched against this database. 
The top six candidate proteins are listed. Two of these proteins are identical to those 
identified from the avidin flow-through. Notably, two of the proteins in the flow-through 
fraction, namely, ribose binding protein and outer membrane protein C, have no cysteines, and 
therefore would not contribute any peptides to the avidin eluate fraction. 
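
A cysteine-only peptide database of the kind described above could be assembled along the 
following lines. This is a sketch under stated assumptions, not the procedure actually used: 
the cleavage rule (after K or R, but not before P) is the common trypsin convention, and the 
function names and protein sequences are hypothetical. 

```python
import re

def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P (common trypsin rule).

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

def cysteine_peptide_db(proteins):
    """Map protein name -> tryptic peptides that contain at least one cysteine."""
    db = {}
    for name, seq in proteins.items():
        cys = [p for p in tryptic_peptides(seq) if "C" in p]
        if cys:
            db[name] = cys
    return db

# Hypothetical sequences for illustration only.
proteins = {
    "with_cys": "MKTACDERPLLKGGCR",
    "no_cys": "MSTKLLAGRPEEK",
}
db = cysteine_peptide_db(proteins)
print(db)
```

A protein without cysteine, like the hypothetical "no_cys" entry, contributes no peptides to 
such a database, which mirrors the absence of ompC and ribose binding protein from the 
avidin eluate. 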



Table 1 

Flow-Through 

Acc.#    Protein Name     MW      # peptide   Score   % Intensity   ppm 
P02990   EF-TU            43156   6           47828   25.4          3.7 
P06996   ompC             40344   8           13194   12.7          7.3 
P00477   SHT              45289   4           7778    5.1           4.2 
P02925   ribose BP        30932   5           4488    7.4           10.7 
P06711   glutamine syn.   51741   4           4196    3.4           7.7 
P08200   ICDH             45728   3           1174    1.6           6.6 

Avidin Elution 

Acc.#    Protein Name     MW      # peptide   Score   % Intensity   ppm    ratio 
P02990   EF-TU            43156   5           17822   48.2          3.2    0.65 
P02934   ompA             37179   2           305     2.2           3.3    0.67 
P39342   hypo. 54.3       54299   2           91      0.6           10.3   0.49 
P76200   hypo. 43         41368   2           44      1.7           18.4   2.6 
P07460   succ. CoA syn.   41368   2           21      1.1           14.2   1.5 
P00477   SHT              45289   3           1       0.2           28.3   0.13 

The proteins listed in Table 1 were identified from the spectra in Figure 3 using the 
ChemApplex PMF program. The top panel was obtained from the flow-through of the avidin 
beads, and the bottom panel was obtained from the avidin elution. The first column lists the 
SwissProt accession number of the protein that was identified. The second column lists an 
abbreviated form of the protein's name: EF-TU for elongation factor-TU, ompC for outer 
membrane protein C, SHT for serine hydroxymethyl transferase, ribose BP for the periplasmic 
ribose binding protein, glutamine syn. for glutamine synthetase, ICDH for isocitrate 
dehydrogenase, ompA for outer membrane protein A, hypo. for hypothetical protein, succ. 
CoA syn. for succinyl coenzyme A synthetase beta chain. The MW column lists the molecular 
weight of the protein; # peptide lists the number of peptides that were matched (including only 
the d0 masses for the avidin eluted peptides); the Score was calculated by the ChemApplex 
program taking into account only the d0 masses; % Intensity is the percentage of the intensity 
of all the masses in the spectrum that could be accounted for by the masses that were matched 
(again only the d0 masses); and ppm is the average intensity-weighted ppm error for those 
masses between the experimental measurements from the mass spectrum and the theoretical 
mass of the peptides. Ratio was calculated manually by dividing the intensity of the d0 
peptide by the intensity for the corresponding d8 peptide, and averaging where possible. The 
low intensity of the d0 masses for SHT explains why the ChemApplex program had difficulty 
in distinguishing SHT from the noise; the program was not looking for the d8 masses, all three 
of which are detectable over the background. Note that ompC and RBP do not contain 
cysteines, and therefore are invisible in the avidin eluate fraction. The confidence in the 
identifications is highest for the proteins with the highest score, and also for the proteins that 
were independently identified in the flow-through fraction and the affinity elution sample. All 
of the proteins in both tables except the two hypothetical proteins in the second table have 
been identified repeatedly from these E. coli lysates. 
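
The ppm and ratio columns can be illustrated with a short sketch. It is an assumption of 
this sketch that the intensity-weighted error uses the absolute value of each ppm error; the 
text does not say whether signed or absolute errors were averaged. The function names and 
the numeric inputs are hypothetical and are not taken from Figure 3. 

```python
def ppm_error(measured, theoretical):
    """Mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def intensity_weighted_ppm(matches):
    """matches: list of (measured m/z, theoretical m/z, intensity).

    Assumption: absolute ppm errors are averaged, weighted by peak intensity.
    """
    total = sum(intensity for _, _, intensity in matches)
    return sum(abs(ppm_error(m, t)) * i for m, t, i in matches) / total

def mean_d0_d8_ratio(pairs):
    """pairs: list of (d0 intensity, d8 intensity) for the peptides of one protein."""
    ratios = [d0 / d8 for d0, d8 in pairs]
    return sum(ratios) / len(ratios)

# Hypothetical values for illustration only.
matches = [(1000.010, 1000.000, 200.0), (1500.000, 1500.015, 100.0)]
pairs = [(650.0, 1000.0), (330.0, 500.0)]
print(f"weighted ppm error: {intensity_weighted_ppm(matches):.1f}")
print(f"mean d0/d8 ratio:   {mean_d0_d8_ratio(pairs):.3f}")
```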

Example 3 

Two E. coli preparations similar to those described above were labeled with ICAT d0 
reagent and ICAT d8 reagent, mixed together and submitted to 1D SDS gel analysis. Slices 
were cut from the gel, washed, digested with trypsin, and the peptides were eluted. No avidin 
affinity chromatography was performed, so that only the most intense ICAT reagent labeled 
peptides were detectable. Upon PMF analysis with ChemApplex, E. coli tryptophanase was 
detectable as the most prominent protein component, after trypsin itself. Under these 
conditions, the peptides that corresponded to ICAT reagent pairs were also detected in an 
oxidized form, due to oxidation at the original cysteine sulfur atom, analogous to the 
oxidation of methionine residues that is commonly observed post SDS gel analysis. Thus, 
each peptide provides two independent measurements of the ratio of d0 to d8, one for the 
reduced form of the peptide, and one for the oxidized form of the peptide. A prominent 
quartet of peaks about 8 amu apart was detected starting at 1581.85, which corresponds to the 
tryptophanase peptide QLPCPAEIXR (SEQ ID NO: 1), and the d8, d0+O and d8+O peaks. 
The ICAT reagent pair with an unmodified methionine had a d8/d0 ratio of 2.1, whereas the 
oxidized pair had a d8/d0 ratio of 1.9. In these experiments, the ratios obtained for ICAT 
reagent pairs of peptides derived from the same protein were commonly within 20% of each 
other, except for the weakest signals and those signals that obviously overlapped other 
peptides (which is particularly apparent when they correspond to expected trypsin digestion 
products from the same proteins already identified). Other ICAT reagent pairs from 
tryptophanase were detectable, but not well resolved over the background. 
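
The expected positions of such a quartet can be estimated from the d0 peak alone. The sketch 
below assumes a singly charged peptide carrying one labeled cysteine and at most one added 
oxygen, and uses standard monoisotopic masses for hydrogen, deuterium, and oxygen; the 
function name icat_quartet is hypothetical. 

```python
D_MINUS_H = 2.014102 - 1.007825   # mass difference between deuterium and hydrogen (Da)
OXYGEN = 15.994915                # monoisotopic mass of oxygen (Da)

def icat_quartet(d0_mz, n_deuterium=8):
    """Expected singly charged m/z values for the d0, d8, d0+O and d8+O forms."""
    shift = n_deuterium * D_MINUS_H
    return {
        "d0": d0_mz,
        "d8": d0_mz + shift,
        "d0+O": d0_mz + OXYGEN,
        "d8+O": d0_mz + shift + OXYGEN,
    }

quartet = icat_quartet(1581.85)
for label, mz in quartet.items():
    print(f"{label:5s} {mz:.2f}")
```

Under these assumptions adjacent peaks are spaced by roughly 8 mass units, consistent with 
the quartet described above. 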

Example 4 

Proteins were isolated from rat cardiac cells from normal myocytes or from myocytes 
that had been subjected to ischemic conditions. Normal rat proteins were labeled with the d0 
ICAT reagent, and the ischemic cell proteins were labeled with the d8 ICAT reagent. The two 
samples were mixed together, and run on a 2D gel, and stained with Coomassie brilliant blue. 
Spots were cut out, digested with trypsin, and submitted to PMF. The data were then 
searched using the ChemApplex software program, using a database that consisted of all of 
the human, rat and mouse proteins in the SwissProt database. The top candidate for one spot 
was human citrate synthetase. The rat and mouse homologues of citrate synthetase were 
absent from the database. The peptide mass fingerprint spectrum contained a prominent ICAT 
reagent pair at 1098 that did not correspond to any of the citrate synthetase peptides. Because 
the rat citrate synthetase protein was not present in the SwissProt database, a rat EST database 
was searched in the Protein Prospector (University of California, San Francisco) software 
program using masses that corresponded exactly to the theoretical masses of citrate synthetase 
that had been identified. One of the EST sequences that was identified by this means 
contained the sequence YSQCR (SEQ ID NO: 2), which corresponded to the ICAT reagent 
pair at 1098. The homologous human sequence was YTQCR (SEQ ID NO: 3), explaining why 
the measured mass did not match the sequence in the database. This peptide sequence is too 
short to be a unique identifier of a protein, and would not be useful had it not been possible to 
assign the peptide to citrate synthetase on the basis of the PMF data. 
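
The mass shift caused by the serine-to-threonine difference can be checked with a short 
calculation. The sketch below uses standard monoisotopic residue masses; the value taken for 
the mass added to cysteine by the d0 reagent (442.224991 Da) is an assumption based on the 
commonly tabulated figure for this type of reagent, and the function name mh_plus is 
hypothetical. 

```python
RESIDUE = {  # monoisotopic residue masses (Da)
    "Y": 163.063329, "S": 87.032028, "T": 101.047679,
    "Q": 128.058578, "C": 103.009185, "R": 156.101111,
}
WATER = 18.010565
PROTON = 1.007276
ICAT_D0 = 442.224991  # assumed mass added to cysteine by the d0 reagent (Da)

def mh_plus(peptide, cys_label=ICAT_D0):
    """Singly protonated monoisotopic mass with every cysteine labeled."""
    mass = sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON
    return mass + cys_label * peptide.count("C")

rat = mh_plus("YSQCR")     # rat EST sequence (SEQ ID NO: 2)
human = mh_plus("YTQCR")   # human sequence (SEQ ID NO: 3)
print(f"rat   YSQCR: {rat:.2f}")
print(f"human YTQCR: {human:.2f}")
print(f"difference:  {human - rat:.2f}")
```

Under these assumptions the labeled rat peptide falls near 1098.5, while the human sequence 
is heavier by one methylene group (about 14.02 Da), so a search restricted to the human 
sequence would not match the observed pair. 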

CLAIMS: 
We claim: 

1. A method of comparing protein compositions of interest between at least two 
different samples which comprises: 

(a) preparing an extract of proteins from each of said at least two different samples; 

(b) providing a set of substantially chemically identical and differentially isotopically 
labeled protein reagents, one for each sample wherein said reagent has a formula selected 
from the group consisting of: 

A-L-PRG and L-PRG wherein A is an affinity label that selectively binds to a capture reagent, 
L is a linker group in which one or more atoms are differentially labeled with one or more 
stable isotopes and PRG is a protein reactive group that selectively reacts with a given protein 
functional group or is a substrate for an enzyme; 

(c) reacting each protein sample of step (a) with a different reagent from said set of 
step (b) to provide isotopically labeled proteins; 

(d) mixing each of said isotopically labeled proteins to form a single mixture of 
different isotopically labeled proteins; 

(e) electrophoresing the mixture of step (d) by an electrophoresing method capable 
of separating proteins within said mixture; and 

(f) detecting the difference in the expression levels of the proteins in the two samples 
by mass spectrometry based on individual peptides derived from chemical or enzymatic 
digestion. 

2. The method of claim 1 wherein said reagent has the formula: 

A-L-PRG 

and affinity tagged proteins in the samples are enzymatically or chemically processed to 
convert them into labeled peptides. 

3. The method of claim 1 wherein said reagent has the formula: 

L-PRG 

and labeled proteins in the samples are enzymatically or chemically processed to convert them 
into labeled peptides. 

4. The method of any one of claims 1, 2 or 3 wherein the protein or peptide portion of 
one or more of the labeled proteins are sequenced by tandem mass spectrometry to identify the 
labeled protein from which the peptide originated. 

5. The method of any one of claims 1, 2 or 3 wherein the proteins are identified by 
peptide mass fingerprinting, and the isotopically labeled peptides are used for quantitation. 

6. The method of any one of claims 1, 2 or 3 in which the amount of one or more 
proteins or peptides in the samples is also determined by mass spectrometry and which further 
comprises the step of introducing into a sample a known amount of one or more internal 
standards for each of the proteins to be quantified. 

7. The method of any one of claims 1, 2 or 3 wherein the released isotopically labeled 
proteins or peptides are separated by chromatography prior to detection by mass 
spectrometry. 

8. The method of claims 1, 2 or 3 where the samples consist of protein mixtures 
derived from tissues, cells, biological fluids including serum, cerebrospinal fluid, urine, 
ascites, or subcellular fractions including supernatants and various membrane-containing 
organelles or nuclear preparations, or protein preparations separated by chromatographic 
methods, capillary electrochromatography or capillary electrophoresis methods. 

9. The method of claims 1, 2 or 3 where the proteins are identified by any protein 
staining technique, or where protein-containing regions are localized by mass spectrometry 
following systematic digestion and extraction or any combination of transblotting and 
digestion. 

10. The method of any one of claims 1, 2 or 3 in which a plurality of proteins or 
peptides in one sample are detected and identified. 

11. The method of any one of claims 1, 2 or 3 further comprising a step in which one 
or more of the proteins or peptides in a sample are chemically or enzymatically processed to 
expose a functional group that can react with a label. 

12. The method of any one of claims 1, 2 or 3 wherein PRG is a protein reactive 
group that selectively reacts with certain protein functional groups and a plurality of proteins 
or peptides are detected and identified in a single sample. 

13. The method of claim 12 wherein two or more substantially chemically identical 
and differentially isotopically labeled protein reactive reagents having different specificities 
for reaction with proteins or peptides are provided and reacted with each sample to be 
analyzed. 

14. The method of claim 13 wherein all of the proteins or peptides in a sample are 
detected and identified. 

15. The method of any one of claims 1, 2 or 3 wherein the relative amounts of one or 
more proteins or peptides in two or more different samples are determined and which further 
comprises the steps of combining the differentially labeled samples, capturing isotopically 
labeled components from the combined samples and measuring the relative abundances of the 
differentially labeled proteins or peptides. 

16. The method of claim 1, 2 or 3 which determines the relative amounts of 
membrane proteins in one or more different samples. 

17. The method of claim 15 in which different samples contain proteins originating 
from different organelles or different subcellular fractions. 

18. The method of claim 15 in which different samples represent proteins or peptides 
expressed in response to different environmental or nutritional conditions, different chemical 
or physical stimuli or at different times. 

19. The method of claim 1 wherein absolute protein concentration is deduced by 
comparison to a known amount of a deuterated or non-deuterated peptide standard, where 
this standard was derived by chemical synthesis or was isolated from biological samples. 

20. The method of claim 1 whereby multiple samples are labeled with PRG containing 
different numbers of heavy atoms so that multiple samples can be separated on a single gel 
and analyzed at one time. 

21. The method of claim 1 whereby proteins of special interest that are previously 
known to be particularly informative are analyzed based on their location on a 1D or 2D gel. 
These proteins can include disease markers as well as control proteins. 

22. The method of claim 1 whereby the post-translational modification status of 
particular proteins is monitored by gel analysis. 